Add telemetry documentation #8133
Document the end-to-end telemetry system across azure-dev, azd-queries, and azure-dev-tools. Covers architecture, data reference, feature instrumentation guide, dashboards/reports, and product overview. Partially contributes to Azure/azure-dev-pr#1772. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add service target values (including containerapp-dotnet, ai.endpoint)
- Add service language values
- Add feature-to-telemetry mapping table for outside-in lookup

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep docs focused on describing the system, not tracking issues. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move env.name row out of Service Languages table (wrong column count)
- Fix Feature Mapping: auth.login.method → auth.method (actual field)
- Fix Feature Mapping: project.infra.type → infra.provider
- Fix Feature Mapping: packaging.type → pack.builder.image/tag
- Fix Feature Mapping: update.availableVersion → fromVersion/toVersion
- Fix Feature Mapping: ExecutionEnvironment → execution.environment
- Fix KQL examples: MachineId → Properties['machine.id']
- Fix PII claim: no PII → no direct PII, sensitive values are hashed

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a comprehensive set of telemetry documentation pages covering architecture, data schema, dashboards, instrumentation guidance, and a product-facing overview, and wires them into the docs index plus cspell allowlists.
Changes:
- New telemetry docs: architecture, data reference, dashboards, feature instrumentation guide, and product overview.
- Updated `docs/README.md` to link the new docs in Guides/Reference/Architecture sections.
- Extended `.vscode/cspell.misc.yaml` with terminology used by the new docs.
Show a summary per file
| File | Description |
|---|---|
| docs/architecture/telemetry.md | New end-to-end architecture doc with diagrams and pipeline detail. |
| docs/reference/telemetry-data.md | New schema reference for events, fields, ResultCode taxonomy, and KQL patterns. |
| docs/reference/telemetry-dashboards.md | New reference for Kusto functions, Power BI reports, and analysis layout. |
| docs/guides/feature-telemetry.md | New step-by-step instrumentation guide for new features. |
| docs/guides/telemetry-overview.md | New product-facing overview of telemetry metrics and dashboards. |
| docs/README.md | Adds links to the new telemetry docs. |
| .vscode/cspell.misc.yaml | Adds terms used in the new docs to the spell-check allowlist. |
Copilot's findings
Comments suppressed due to low confidence (1)
docs/guides/feature-telemetry.md:1
- Trailing whitespace at the end of line 184 (after `fields/`). Minor formatting cleanup.
# Feature Telemetry Guide — Adding Telemetry to New Features
- Files reviewed: 7/7 changed files
- Comments generated: 10
jongio left a comment
Summary
Solid docs addition that'll help contributors understand the telemetry stack end-to-end. Three code references don't match what's currently in the codebase, and one of them could mislead users trying to opt out of telemetry.
Findings
| # | Severity | File | Issue |
|---|---|---|---|
| 1 | 🔴 high | telemetry-overview.md | Opt-out config path `defaults.collectTelemetry` doesn't exist in code |
| 2 | 🟡 medium | telemetry-data.md | `ext.upgrade` and `ext.promote` events aren't defined in `events.go` |
| 3 | 🟡 medium | feature-telemetry.md | `azdext.CommandResult` type doesn't exist; extensions use `ReportError()` |
| 4 | 🟣 question | multiple | Internal infrastructure details (Kusto cluster, LENS job IDs, internal repo URLs) in a public repo |
hemarina left a comment
Really nice work pulling this together — the end-to-end picture across all three repos is super useful, and the structure is clean. 🙂
+1 to all the findings already raised by @jongio and the Copilot bot (config-path opt-out, undefined `ext.upgrade`/`ext.promote`, `azdext.CommandResult`, the KQL syntax issues, and the questions around internal infra references). I cross-checked them against `cli/azd/` and they're all real; worth addressing before merge.
Adding what I verified is solid so you don't need to re-check those areas:
Things I checked and they're all good ✅
- File paths in `architecture/telemetry.md` all exist: `cli/azd/cmd/middleware/telemetry.go`, `internal/telemetry/{telemetry.go,storage.go,uploader.go}`, `appinsights-exporter/span_to_envelope.go`, `internal/tracing/events/events.go`, `internal/tracing/fields/fields.go`.
- `~/.azd/telemetry/*.trn`, `upload.lock`, `~/.azd/first-run`, `maxRetryCount=3`, `itemFileMaxTimeKept`: all match the code.
- `azd telemetry upload` is indeed a hidden subcommand invoked as a deferred background subprocess.
- `StringHashed` uses SHA-256 over lowercased input; the privacy claims about hashed fields (`project.template.id`, `project.name`, `env.name`) line up with `internal/tracing/fields/key.go`.
- VS Code opt-out via `telemetry.telemetryLevel=off` matches the extension.
- Doc placement follows `.github/instructions/documentation.instructions.md`.
- All five new links in `docs/README.md` resolve.
- No glossary collisions with `docs/concepts/glossary.md`.
- The mermaid in the PR body matches what's in `architecture/telemetry.md`.
One tiny nit (totally optional)
.vscode/cspell.misc.yaml: file-scoped overrides are the right call per AGENTS guidance. Just a thought for later — if terms like Kusto, Entra, dcount, tostring keep showing up across new telemetry docs, promoting them to the global word list down the line would save some churn. Fine as-is for this PR.
TL;DR
Once the items raised by jongio and the bot are addressed, I think this is in great shape. Thanks for taking the time to write all of this down; it's going to make onboarding way easier. 🙏
- Remove internal repo references (azd-queries, azure-dev-tools)
- Remove Kusto cluster/database details, GDPR pipeline, Power BI refs
- Remove telemetry-dashboards.md (100% internal)
- Fix azdext.CommandResult example (use ServiceError directly)
- Fix defaults.collectTelemetry config path (doesn't exist)
- Disambiguate service.name field (app-level vs service call)
- Rewrite query examples as illustrative (no internal function deps)
- Trim cspell overrides for removed internal terms
- Internal docs will be maintained in Azure/azure-dev-pr

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Thanks Marina! All findings addressed.
Good point on promoting common terms to the global list later; will do once we see the pattern.
wbreza left a comment
Reviewed the latest commit (c3763c56 — internal/public split). Verified author's responses to all prior reviewer comments against the source tree — events.go, fields.go, framework_service.go, telemetry.go all check out, and every referenced file path exists.
The split into companion PR azure-dev-pr#1787 is the right call and cleans the public surface up nicely.
Three non-blocking findings — all about long-term doc maintainability, none are merge blockers.
🟡 medium — Overlap with existing docs/guides/observability.md; no cross-link
`docs/guides/observability.md` already covers tracing architecture, Trace/Span/Attribute concepts, event-name prefixes (`cmd.`/`vsrpc.`/`mcp.`), and "Adding an Attribute / Adding a Span", directly overlapping with `feature-telemetry.md` and `architecture/telemetry.md`. The new docs never link to it, and `observability.md` isn't updated to point at the new docs.
Per .github/instructions/documentation.instructions.md: "Link to detailed implementation docs … rather than duplicating content." This is real drift risk — when the tracing API or event prefixes evolve, three guides will need to stay in sync.
Pick one:
- Add "See also: Observability and Tracing" links from the new docs, plus a reciprocal pointer from `observability.md` to the new feature/data/architecture docs, OR
- Slim `observability.md` down to a trace-debugging guide (Jaeger, `--trace-log-file`, `--trace-log-url`) and defer to the new docs for instrumentation/architecture.
🔵 low — Two sources of truth for the schema
feature-telemetry.md Step 4 calls docs/specs/metrics-audit/telemetry-schema.md the "canonical schema, source of truth for privacy audits," but docs/reference/telemetry-data.md also enumerates the full event/field catalog with classifications. No statement of which is authoritative or which to update first.
Suggested fix: add a one-line note in telemetry-data.md pointing at telemetry-schema.md as authoritative, and have Step 4 of feature-telemetry.md require updating both (or call out which one is generated from which).
⚪ nit — Glossary not updated
docs/concepts/glossary.md has no entries for span, attribute, trace, event, classification, or OpenTelemetry, despite the new docs leaning heavily on these terms. Repo conventions in .github/instructions/documentation.instructions.md call out adding new concepts there. Several of these are already defined inline in observability.md and could be lifted into the glossary.
Easy to defer to a follow-up if you'd rather keep this PR scoped.
Confirmed (no action needed)
- `service.errorCode` typed as `measurement`: verified `IsMeasurement: true` in `fields.go:660-665`
- `js`/`ts` language values: verified `framework_service.go:22-23`
- `ext.upgrade`/`ext.promote` event constants: verified `events.go:32,34`
- `cspell.misc.yaml` additions scoped correctly under `overrides:` per repo convention
Telemetry Documentation (Public)
Adds telemetry docs covering architecture, data schema, instrumentation guidance, and product overview.
What changed
Based on review feedback (Jeffrey, Jon, Victor, Marina), internal content was split out to a companion PR in azure-dev-pr:
Companion PR: https://github.com/Azure/azure-dev-pr/pull/1787
Files
- `docs/architecture/telemetry.md`
- `docs/reference/telemetry-data.md`
- `docs/guides/feature-telemetry.md`
- `docs/guides/telemetry-overview.md`
- `docs/README.md`
- `.vscode/cspell.misc.yaml`
- `docs/reference/telemetry-dashboards.md`

Review feedback addressed
- `defaults.collectTelemetry` → fixed to `AZURE_DEV_COLLECT_TELEMETRY=no` only
- `azdext.CommandResult` → fixed to `azdext.ServiceError`
- `ext.upgrade`/`ext.promote`: verified they exist in code, kept
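To illustrate the opt-out fix listed above, here is a small Go sketch of an environment-variable gate. Only the variable name `AZURE_DEV_COLLECT_TELEMETRY` and the value `no` come from this thread; the helper `telemetryEnabled`, its injected lookup function, and the rule that any other value (including unset) leaves telemetry on are assumptions for illustration, not a description of how azd parses the variable.

```go
package main

import (
	"fmt"
	"os"
)

// Name of the opt-out variable as given in the PR description.
const collectTelemetryEnvVar = "AZURE_DEV_COLLECT_TELEMETRY"

// telemetryEnabled is a hypothetical helper. It assumes that only the
// literal value "no" disables collection; azd's real parsing rules may
// differ. The lookup function is injected so the check can be exercised
// without mutating the process environment.
func telemetryEnabled(lookup func(string) string) bool {
	return lookup(collectTelemetryEnvVar) != "no"
}

func main() {
	// With the real environment: reports false only when the variable
	// is set to "no" in the current shell.
	fmt.Println("telemetry enabled:", telemetryEnabled(os.Getenv))
}
```

Under these assumptions, running the program with `AZURE_DEV_COLLECT_TELEMETRY=no` set in the shell reports telemetry as disabled, which matches the single opt-out path the docs now describe.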